Date created: June 10, 2022
Author: P. Alexander Burnham
Summary Last time we set up several cubes from geotif files and explored some of the basic querrying and cube building functionality of cubetime. However, all of the files we were working with, had the same scales and grid sizes. Spacetime has the functionality to rescale rasters of different extents, grid sizes, epsg codes etc. and create a cleaned cube where all of those factors are consistent accross the data layers.
Let’s load two rasters of data from India. We can then load the data up as a file object and compare some attributes between them.
# read files and load data
data = ["demoData/LULC_1995.tif", "demoData/India_cropped-area_1km_2016.tif"]
ds = read_data(data)
# check number of bands
ds.number_of_bands()
# check the epsg codes
## [1, 1]
ds.extract_epsg_code()
# check the raster dimensions
## ['EPSG:32644', 'EPSG:4326']
ds.get_raster_dimensions()
# check pixel size
## [(29484, 33555), (2373, 3269)]
ds.pixel_size()
# check no data vlaues
## [100.0, 0.008868160000000002]
ds.get_nodata_value()
## [127.0, -1.7e+308]
Let’s plot the two files and compare them.
# Get the array data
arrays = ds.get_raster_data()
# plot file 1
plt.imshow(arrays[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7f85dc359820>
plt.show()
# plot file 2
plt.imshow(arrays[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7f85b89f8df0>
plt.show()
Apart from the fact that both datasets are from approximately the
same region, we can see that almost all of the attributes between these
two files are different. Using raster_scale() we can get
these two files to have the same spatial reference system,, pixel size
and nodata value with cells aligned to the same lat and lon values.
# lets align these files
alignedDS = raster_align(data=ds, resolution=.08, SRS=4326, noneVal=127)
Let’s look at these attributes now that we have run the align function.
# check epsg codes
alignedDS.extract_epsg_code()
# get raster dimensions
## ['EPSG:4326', 'EPSG:4326']
alignedDS.get_raster_dimensions()
# pixel size
## [(409, 383), (264, 363)]
alignedDS.pixel_size()
# check no data values
## [0.08, 0.08]
alignedDS.get_nodata_value()
## [127.0, 127.0]
Lets also look at the rescaled images.
# get rescaled data
scaledArray = alignedDS.get_raster_data()
# plot file 1
plt.imshow(scaledArray[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7f85b8a46ac0>
plt.show()
# plot file 2
plt.imshow(scaledArray[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7f85dc4a2100>
plt.show()
Let’s upsale the lower resolution raster using the default nearest r algorithm
# finest grain resolution
alignedDS_max = raster_align(data=ds, resolution="max")
alignedDS_max.pixel_size()
# plot the file
## [100.0, 100.0]
scaledArray_max = alignedDS_max.get_raster_data()
plt.imshow(scaledArray_max[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7f8448096af0>
plt.show()
Next we will downscale the higher resolution image using the “min” option using the same algorithm
# coarse resolution
alignedDS_min = raster_align(data=ds, resolution="min")
alignedDS_min.pixel_size()
# plot the file
## [957.302191144903, 957.302191144903]
scaledArray_min = alignedDS_min.get_raster_data()
plt.imshow(scaledArray_min[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7f85dc34b550>
plt.show()
Intersection We can see now that the only difference is the extent of the files. The first file covers more surface area of the globe. Lets crop the files to have the same greatest common bounding box, (intersection).
# trim the file object
trimmedDS_intersection = raster_trim(alignedDS, method = "intersection")
# lets compare the dimensions of the new objects
trimmedDS_intersection.get_raster_dimensions()
## [(264, 363), (264, 363)]
Comparing the plotted images reveals that the files are now propperly aligned and trimmed
# get data
trimmedArray = trimmedDS_intersection.get_raster_data()
# plot file 1
plt.imshow(trimmedArray[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7f85b89e6550>
plt.show()
# plot file 2
plt.imshow(trimmedArray[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7f84480a73a0>
plt.show()
Union We can also chose to crop the files so that no data are lost. However, no data values will populate the areas that do not exist on certain rasters.
# trim the file object
trimmedDS_union = raster_trim(alignedDS, method = "union")
# lets compare the dimensions of the new objects
trimmedDS_union.get_raster_dimensions()
## [(409, 383), (409, 383)]
Let’s look at these new files which should have the larger extent
# get data
trimmedArray = trimmedDS_union.get_raster_data()
# plot file 1
plt.imshow(trimmedArray[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7f85b8a3b790>
plt.show()
# plot file 2
plt.imshow(trimmedArray[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7f85dc49b640>
plt.show()
Specified Points We can also crop our files using user
defined upper left and lower right coordinates. These are the mimum
required values for specifying a bounding box.
# trim the file object
trimmedDS_box = raster_trim(alignedDS, method="corners", ul = [73, 25], lr = [81, 13])
# get dimensions
trimmedDS_box.get_raster_dimensions()
## [(100, 150), (100, 150)]
Let’s look at these files, next.
# get data
trimmedArray = trimmedDS_box.get_raster_data()
# plot file 1
plt.imshow(trimmedArray[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7f8428f4b580>
plt.show()
# plot file 2
plt.imshow(trimmedArray[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7f84480b6430>
plt.show()
Shape Files We can also use a shape file to crop these
images. I am using the higher resolution cube and New Delhi’s shape
file.
# trim the file object
trimmedDS_shape = raster_trim(alignedDS_max, method="shape", shapeFile="demoData/DelhiShape/Districts.shp")
# get dimensions
trimmedDS_shape.get_raster_dimensions()
## [(495, 530), (495, 530)]
Let’s plot the result if the shape file cropping method.
# get data
trimmedArray = trimmedDS_shape.get_raster_data()
# plot file 1
plt.imshow(trimmedArray[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7f8428e17640>
plt.show()
# plot file 2
plt.imshow(trimmedArray[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7f84480c3d30>
plt.show()
Now that we have cleaned data sets, we can assemble them into a cube
as shown in the previous vignette. Lets start be creating a time object.
cube_time() allows you to create a time vector that skips
user-specified chunks of time using the skips arguemnt.
# create time vec
yearObj = cube_time(start="1995", length=2, scale = "year", skips = 21)
We will now create a cube by assigning files and bands to time.
cube = make_cube(data = trimmedDS_shape, fileName = "indiaCube.nc4", organizeFiles = "filestotime", organizeBands="bandstotime", timeObj = yearObj)
Finally, lets look at the data
cube.get_raster_data()
<xarray.DataArray (lat: 530, lon: 495, time: 2)>
array([[[127., 127.],
[127., 127.],
[127., 127.],
...,
[127., 127.],
[127., 127.],
[127., 127.]],
[[127., 127.],
[127., 127.],
[127., 127.],
...,
[127., 127.],
[127., 127.],
[127., 127.]],
[[127., 127.],
[127., 127.],
[127., 127.],
...,
...
...,
[127., 127.],
[127., 127.],
[127., 127.]],
[[127., 127.],
[127., 127.],
[127., 127.],
...,
[127., 127.],
[127., 127.],
[127., 127.]],
[[127., 127.],
[127., 127.],
[127., 127.],
...,
[127., 127.],
[127., 127.],
[127., 127.]]], dtype=float32)
Coordinates:
* lon (lon) float32 9.33e+04 9.34e+04 9.35e+04 ... 1.426e+05 1.427e+05
* lat (lat) float32 3.201e+06 3.201e+06 3.201e+06 ... 3.148e+06 3.148e+06
* time (time) datetime64[ns] 1995-12-31 2016-12-31